GH-45819: [C++] Add OptionalBitmapAnd utility function #45869

Open: wants to merge 5 commits into main

Conversation

raulcd
Member

@raulcd raulcd commented Mar 20, 2025

Rationale for this change

Bitmaps are optional, so a utility function that performs a BitmapAnd operation while accounting for either bitmap being null is handy: callers no longer have to handle the null case themselves.

What changes are included in this PR?

Add a new OptionalBitmapAnd function.

Are these changes tested?

Yes.

Are there any user-facing changes?

No, just a new utility function.

@raulcd
Member Author

raulcd commented Mar 20, 2025

@github-actions crossbow submit -g cpp

Revision: 4188841

Submitted crossbow builds: ursacomputing/crossbow @ actions-fd8ca4245c

Task Status
example-cpp-minimal-build-static GitHub Actions
example-cpp-minimal-build-static-system-dependency GitHub Actions
example-cpp-tutorial GitHub Actions
test-alpine-linux-cpp GitHub Actions
test-build-cpp-fuzz GitHub Actions
test-conda-cpp GitHub Actions
test-conda-cpp-meson GitHub Actions
test-conda-cpp-valgrind GitHub Actions
test-cuda-cpp-ubuntu-22.04-cuda-11.7.1 GitHub Actions
test-debian-12-cpp-amd64 GitHub Actions
test-debian-12-cpp-i386 GitHub Actions
test-fedora-39-cpp GitHub Actions
test-ubuntu-22.04-cpp GitHub Actions
test-ubuntu-22.04-cpp-20 GitHub Actions
test-ubuntu-22.04-cpp-bundled GitHub Actions
test-ubuntu-22.04-cpp-emscripten GitHub Actions
test-ubuntu-22.04-cpp-no-threading GitHub Actions
test-ubuntu-24.04-cpp GitHub Actions
test-ubuntu-24.04-cpp-bundled-offline GitHub Actions
test-ubuntu-24.04-cpp-gcc-13-bundled GitHub Actions
test-ubuntu-24.04-cpp-gcc-14 GitHub Actions
test-ubuntu-24.04-cpp-minimal-with-formats GitHub Actions
test-ubuntu-24.04-cpp-thread-sanitizer GitHub Actions

@raulcd
Member Author

raulcd commented Mar 20, 2025

CI failures are unrelated. Happy to rebase once they are fixed. @pitrou is something like this what you had in mind?

@raulcd raulcd requested a review from pitrou March 20, 2025 15:26
/// their respective bit-offsets for the given bit-length and put
/// the results in out_buffer starting at the given bit-offset.
/// Both right and left buffers are optional. If any of the buffers is
/// null a bitmap of zeros is returned.
Member

Hmm, actually, the more useful semantics is that a null pointer means the bitmap is all-1s.

This reflects the situation where an Array doesn't have a null bitmap: all values are valid.
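
To make the all-1s semantics concrete, here is a minimal sketch that does not use Arrow itself. The helper names `GetBitSketch`, `GetOptionalBit` and `OptionalAndSketch` are hypothetical and exist only for illustration; the sketch assumes the LSB-first bit order used by Arrow bitmaps and returns one `bool` per position rather than a packed buffer.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical helper (not Arrow's API): read bit i of an LSB-first bitmap.
inline bool GetBitSketch(const uint8_t* bits, int64_t i) {
  return (bits[i / 8] >> (i % 8)) & 1;
}

// A null pointer stands for "every bit is 1", mirroring an Array that has
// no validity bitmap because all of its values are valid.
inline bool GetOptionalBit(const uint8_t* bits, int64_t offset, int64_t i) {
  return bits == nullptr || GetBitSketch(bits, offset + i);
}

// AND of two optional bitmaps; the result is unpacked for readability.
std::vector<bool> OptionalAndSketch(const uint8_t* left, int64_t left_offset,
                                    const uint8_t* right, int64_t right_offset,
                                    int64_t length) {
  std::vector<bool> out(length);
  for (int64_t i = 0; i < length; ++i) {
    out[i] = GetOptionalBit(left, left_offset, i) &&
             GetOptionalBit(right, right_offset, i);
  }
  return out;
}
```

With these semantics, ANDing a bitmap with a null input yields the bitmap unchanged, and ANDing two null inputs yields all-valid, which is the identity behaviour one would expect from a missing validity bitmap.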

Member

@pitrou pitrou Mar 20, 2025

And this would be even more useful as:

Result<std::shared_ptr<Buffer>> OptionalBitmapAnd(
    MemoryPool* pool, const std::shared_ptr<Buffer>& left, int64_t left_offset,
    const std::shared_ptr<Buffer>& right, int64_t right_offset, int64_t out_offset);

... because then, if one of the inputs is null, and the offsets are compatible, we can return the other input (perhaps sliced) instead of allocating a new buffer.
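
A rough sketch of that early-return idea follows. It is not the PR's implementation: `BufferSketch` is a stand-in for `arrow::Buffer`, the `MemoryPool` and `Result` wrappers are dropped, and an explicit `length` parameter is added as an assumption so the example is self-contained. Returning a null pointer when both inputs are null is also an assumption, chosen here to keep the "null means all valid" convention.

```cpp
#include <cstdint>
#include <memory>
#include <vector>

// Stand-in for arrow::Buffer, used only in this sketch.
using BufferSketch = std::vector<uint8_t>;
using BufferPtr = std::shared_ptr<BufferSketch>;

inline bool GetBit(const BufferSketch& b, int64_t i) {
  return (b[i / 8] >> (i % 8)) & 1;
}
inline void SetBit(BufferSketch& b, int64_t i) {
  b[i / 8] |= static_cast<uint8_t>(1u << (i % 8));
}

BufferPtr OptionalBitmapAndSketch(const BufferPtr& left, int64_t left_offset,
                                  const BufferPtr& right, int64_t right_offset,
                                  int64_t length, int64_t out_offset) {
  if (!left && !right) return nullptr;  // both all-valid: result is all-valid
  if (!left || !right) {
    const BufferPtr& present = left ? left : right;
    const int64_t present_offset = left ? left_offset : right_offset;
    // Offsets already agree: hand back the input, no allocation needed.
    if (present_offset == out_offset) return present;
  }
  // General path: allocate a zeroed output and AND bit by bit, treating a
  // null input as all-1s.
  auto out = std::make_shared<BufferSketch>((out_offset + length + 7) / 8, 0);
  for (int64_t i = 0; i < length; ++i) {
    const bool l = !left || GetBit(*left, left_offset + i);
    const bool r = !right || GetBit(*right, right_offset + i);
    if (l && r) SetBit(*out, out_offset + i);
  }
  return out;
}
```

The design trade-off is that the caller can no longer assume the returned buffer is freshly allocated, so it must be treated as immutable.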

@raulcd raulcd marked this pull request as draft March 24, 2025 14:57
@pitrou
Member

pitrou commented Jun 12, 2025

@raulcd Would you like to revisit this?

@raulcd
Member Author

raulcd commented Jun 16, 2025

I started again, trying to adapt to the new APIs, but I am finding it slightly challenging to adapt the existing tests. I will give it another try, but it is taking me quite a long time. This is the diff for my new branch:
main...raulcd:arrow:GH-45819-2

@raulcd raulcd marked this pull request as ready for review June 17, 2025 12:15
@raulcd raulcd requested a review from pitrou June 17, 2025 12:16
@raulcd
Member Author

raulcd commented Jun 17, 2025

@pitrou this now works with the proposed API. I had to make several changes to the tests in order to adapt the existing TestAligned and TestUnaligned functions. I would appreciate any tips on improving those changes.


for (int64_t left_offset : {0, 1, 3, 5, 7, 8, 13, 21, 38, 75, 120, 65536}) {
  BitmapFromVector(left_bits, left_offset, &left, &length);
  if (left_bits.size() > 0) {
Member

I think this is putting too much logic in these test methods. I would suggest something else:

  1. test OptionalBitmapAnd with both non-null arguments like you do here
  2. have separate tests for when one or both of the arguments is null
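
One possible shape for that split, sketched with plain `assert` rather than the GoogleTest fixtures the Arrow test suite uses. `AndSketch` is a hypothetical, unpacked stand-in for the function under test, and the test names are illustrative only.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

using Bits = std::vector<bool>;

// Hypothetical stand-in for the function under test. A null input means
// "all valid"; a null result means the output is all valid too.
std::unique_ptr<Bits> AndSketch(const Bits* left, const Bits* right) {
  if (!left && !right) return nullptr;
  if (!left) return std::make_unique<Bits>(*right);
  if (!right) return std::make_unique<Bits>(*left);
  auto out = std::make_unique<Bits>(left->size());
  for (std::size_t i = 0; i < left->size(); ++i) {
    (*out)[i] = (*left)[i] && (*right)[i];
  }
  return out;
}

// 1. Both arguments non-null: the regular AND path.
void TestBothPresent() {
  const Bits left{true, true, false}, right{true, false, false};
  const auto out = AndSketch(&left, &right);
  assert(out && *out == Bits({true, false, false}));
}

// 2. Separate, small tests for each null combination.
void TestLeftNull() {
  const Bits right{true, false, true};
  const auto out = AndSketch(nullptr, &right);
  assert(out && *out == right);
}

void TestRightNull() {
  const Bits left{false, true, true};
  const auto out = AndSketch(&left, nullptr);
  assert(out && *out == left);
}

void TestBothNull() { assert(AndSketch(nullptr, nullptr) == nullptr); }
```

Keeping the null cases in their own tests leaves the offset-sweeping loops free of branching on whether an input exists.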

Comment on lines +406 to +408
if (right_offset == out_offset) {
  return right;
}
Member

We could easily slice the input buffer if both offsets are equal modulo 8. It would avoid copying in slightly more cases.

Member

It could be a dedicated function, ViewOrCopyBitmap, by the way. Searching through the source code would probably turn up other places where such a function could be useful.
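
A sketch of what such a helper might look like, under stated assumptions: no function named `ViewOrCopyBitmap` is confirmed to exist in Arrow, `BitmapViewSketch` is a hypothetical stand-in for a sliced `arrow::Buffer`, and the `length` parameter is added for self-containment. The key observation is that when the input offset is at least the output offset and the two are equal modulo 8, starting the view `(in_offset - out_offset) / 8` bytes into the input puts the wanted bits exactly at `out_offset`, so no copy is needed.

```cpp
#include <cstdint>
#include <memory>
#include <vector>

using Bytes = std::vector<uint8_t>;

// Hypothetical stand-in for a sliced buffer: shares ownership of `owner`
// and starts reading at byte `byte_start`.
struct BitmapViewSketch {
  std::shared_ptr<const Bytes> owner;
  int64_t byte_start;
  bool copied;
  bool GetBit(int64_t i) const {
    return ((*owner)[byte_start + i / 8] >> (i % 8)) & 1;
  }
};

BitmapViewSketch ViewOrCopyBitmapSketch(const std::shared_ptr<const Bytes>& in,
                                        int64_t in_offset, int64_t length,
                                        int64_t out_offset) {
  const int64_t delta = in_offset - out_offset;
  if (delta >= 0 && delta % 8 == 0) {
    // Offsets agree modulo 8: a byte-aligned slice is enough (zero-copy).
    return {in, delta / 8, false};
  }
  // Otherwise copy bit by bit into a fresh, zeroed buffer.
  auto out = std::make_shared<Bytes>((out_offset + length + 7) / 8, 0);
  for (int64_t i = 0; i < length; ++i) {
    const int64_t src = in_offset + i;
    const int64_t dst = out_offset + i;
    if (((*in)[src / 8] >> (src % 8)) & 1) {
      (*out)[dst / 8] |= static_cast<uint8_t>(1u << (dst % 8));
    }
  }
  return {out, 0, true};
}
```

The equal-offset case from the reviewed snippet is the special case `delta == 0`, where the view starts at byte 0 of the input.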

2 participants